UXO: An XML-Based Extensible Annotation Editor

Author

  • Jan-Torsten Milde
Abstract

The integrated editor UXO is the result of ongoing research and development by the text-technology group at Bielefeld. As a full-featured XML-based editing system, it also integrates a JDBC interface, which makes it possible to combine structured annotated data with information imported from relational databases. The mapping processes between different levels of annotation can be programmed either with the integrated Scheme interpreter or by extending the functionality of UXO through the predefined Java API.

14.1. System Architecture

UXO is an integrated XML-based text editor that is highly configurable and can therefore be adapted easily to specific user needs. The editor is implemented in Java, so it runs on a large number of platforms. Data can be entered either by typing text or by working directly on the displayed structure tree. The structure can be validated by starting the built-in XML parser (XML4J from IBM, now Xerces).

In contrast to the standard XML tools available, UXO offers an integrated interface to relational databases in combination with a built-in interpreter for the Scheme programming language, which serves as the system's scripting language. This combination makes it possible to configure the editor for a wide range of applications simply by defining control and configuration scripts. Because these scripts are external to the editor, reconfiguring the system does not require recompiling it. More experienced programmers can extend the permanent features of UXO through its API, which in principle allows any available Java functionality to be integrated (see also Reinsch and Milde, 2000; Milde, 1999).

The editor handles the full Unicode character set (Unicode, 1996). Its graphical user interface can be completely reconfigured, which makes it possible to localize the software and to define appropriate control key sequences.
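The validation step mentioned above can be illustrated with the JAXP parser that ships with the JDK. This is only a minimal sketch and not UXO's own code: the class name `ValidationSketch`, the method `validate`, and the tiny `text`/`s` DTD are hypothetical and chosen for illustration.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

// Hypothetical helper, not part of UXO: validates a document against its DTD.
public class ValidationSketch {

    /** Parses the XML text with DTD validation and returns the validity errors found. */
    static List<String> validate(String xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setValidating(true); // check the document against its DOCTYPE
        DocumentBuilder builder = factory.newDocumentBuilder();

        List<String> problems = new ArrayList<>();
        builder.setErrorHandler(new ErrorHandler() {
            @Override public void warning(SAXParseException e) { /* ignored in this sketch */ }
            @Override public void error(SAXParseException e) { problems.add(e.getMessage()); }
            @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
        });

        builder.parse(new InputSource(new StringReader(xml)));
        return problems;
    }

    public static void main(String[] args) throws Exception {
        // Assumed toy DTD: a text consists of one or more sentences.
        String dtd = "<!DOCTYPE text [<!ELEMENT text (s+)><!ELEMENT s (#PCDATA)>]>";
        System.out.println(validate(dtd + "<text><s>ok</s></text>"));
        System.out.println(validate(dtd + "<text><p>undeclared</p></text>"));
    }
}
```

Validity errors are collected rather than thrown, so an editor could list them next to the structure tree; well-formedness errors remain fatal and are propagated to the caller.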
Fig. 14.1 shows the basic system architecture. Internally, all data is handled as a DOM instance. Database requests (via JDBC; see Klute, 1998), servlet requests (via HTTP), and XML documents can all be mapped onto this model. The editor can modify the content of a document and validate its structure.

∗ Published in: Proceedings of the GLDV-Spring Meeting 2001, Henning Lobin (ed.), Giessen University, March 28th–30th, 2001, pp. 151–159. http://www.uni-giessen.de/fb09/ascl/gldv2001/
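As a rough illustration of how tabular database results might be mapped onto a DOM instance, the sketch below builds a DOM tree from a list of rows. It is an assumption about the general idea, not UXO's actual mapping: the class `RowsToDom`, the method `toDom`, and the element names are hypothetical. In a real setting the rows would be read from a JDBC `ResultSet`; plain maps are used here so that the example runs without a database driver.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical helper, not part of UXO: turns table rows into a DOM tree.
public class RowsToDom {

    /**
     * Creates one element per row and one child element per column.
     * Column names must be valid XML element names.
     */
    static Document toDom(String rootName, String rowName,
                          List<Map<String, String>> rows) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
        Element root = doc.createElement(rootName);
        doc.appendChild(root);

        for (Map<String, String> row : rows) {
            Element rowElement = doc.createElement(rowName);
            for (Map.Entry<String, String> cell : row.entrySet()) {
                Element cellElement = doc.createElement(cell.getKey());
                cellElement.setTextContent(cell.getValue());
                rowElement.appendChild(cellElement);
            }
            root.appendChild(rowElement);
        }
        return doc;
    }

    public static void main(String[] args) throws Exception {
        // Assumed sample row standing in for a database query result.
        Map<String, String> row = new LinkedHashMap<>();
        row.put("lemma", "Haus");
        row.put("pos", "N");

        Document doc = toDom("lexicon", "entry", List.of(row));
        Element entry = (Element) doc.getDocumentElement().getFirstChild();
        System.out.println(entry.getTagName() + " has "
                + entry.getChildNodes().getLength() + " fields");
    }
}
```

Once the rows are in DOM form, they can be handled with the same tree operations as any XML document loaded into the editor, which is the point of using a single internal model.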


Similar resources

The Universal XML Organizer: UXO

The integrated editor UXO is the result of ongoing research and development by the text-technology group at Bielefeld. As a full-featured XML-based editing system, it also integrates a JDBC interface, which makes it possible to combine structured annotated data with information imported from relational databases. The mapping processes between different levels of annotation can be programmed either by ...


A Toolkit for an Oral History Digital Archive

In this work we propose an XML-based toolkit for the construction of an oral history digital archive that allows the filing, classification and annotation of multimedia resources associated with a corpus of interviews. We describe the general organization of the archive and focus on content creation tools. In particular, we present a document editor for the classification and annotation of interv...


GitDOX: A Linked Version Controlled Online XML Editor for Manuscript Transcription

In this paper we present GitDOX, an open source online, schema aware XML annotation interface linked to Natural Language Processing tools, which uses the online GitHub platform as a backend for version controlled electronic corpus development. We apply this platform to the use case of transcribing and annotating Coptic manuscript data from first millennium Egypt, in a collaborative team. The ar...


Empirical Evaluation of Semi-Automated XML Annotation of Text Documents with the GoldenGATE Editor

Digitized scientific documents should be marked up according to domain-specific XML schemas, to make maximum use of their content. Such markup allows for advanced, semantics-based access to the document collection. Many NLP applications have been developed to support automated annotation. But NLP results often are not accurate enough; and manual corrections are indispensable. We therefore have ...


Guide to Annotation

A review of multimedia annotation techniques, in particular image annotation, is presented. The annotation requirements for the Benchmarking workpackage of the MUSCLE EU Network of Excellence are also presented and discussed. A significant contribution is the creation of a keyword vocabulary based on an analysis of keywords used in experiments for testing automated image annotation algorithms a...




Publication date: 2001